Parsing Agglutinative Word Structures And Its Application To Spelling Checking For Turkish
Abstract
Most of the research on parsing natural languages has been concerned with English, or with other languages morphologically similar to English. Parsing agglutinative word structures has attracted relatively little attention, most probably because agglutinative languages contain word structures of considerable complexity, and parsing words in such languages requires morphological analysis techniques. In this paper, we present the design and implementation of a morphological root-driven parser for Turkish word structures which has been incorporated into a spelling checking kernel for on-line Turkish text. The agglutinative nature of the language and the resulting complex word formations, various phonetic harmony rules and subtle exceptions present certain difficulties not usually encountered in the spelling checking of languages like English and make this a very challenging problem.
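As a rough illustration of what "root-driven" checking with a phonetic harmony rule can look like, the following minimal Python sketch looks for a known root at the start of a word and then checks that the remainder is a plural suffix agreeing with the root's last vowel. It is not the parser described in the abstract: the lexicon `ROOTS`, the helper names, and the restriction to the single plural suffix (-ler/-lar) are all assumptions made for this example.

```python
# Toy sketch only: a hypothetical lexicon and a single suffix (the plural).
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")
ROOTS = {"ev", "kitap", "göz", "okul"}  # hypothetical toy lexicon


def last_vowel(stem):
    """Return the last vowel in stem, or None if it has no vowel."""
    for ch in reversed(stem):
        if ch in FRONT_VOWELS or ch in BACK_VOWELS:
            return ch
    return None


def plural_suffix(stem):
    """Pick -ler after a front vowel and -lar otherwise (vowel harmony)."""
    return "ler" if last_vowel(stem) in FRONT_VOWELS else "lar"


def is_valid(word):
    """Root-driven check: find a known root prefix, then validate the rest."""
    for i in range(len(word), 0, -1):  # try the longest candidate root first
        root, rest = word[:i], word[i:]
        if root in ROOTS and rest in ("", plural_suffix(root)):
            return True
    return False


if __name__ == "__main__":
    for w in ["evler", "evlar", "kitaplar", "okuller"]:
        print(w, is_valid(w))
```

Under these assumptions, "evler" and "kitaplar" are accepted, while "evlar" and "okuller" are rejected because the suffix vowel disagrees with the root. A real checker would need a full root lexicon, the complete suffix ordering, and the exceptions the abstract mentions.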
Similar resources
Parsing Turkish with the Lexical Functional Grammar Formalism
This paper describes our work on parsing Turkish using the lexical-functional grammar formalism. This work represents the first effort for parsing Turkish. Our implementation is based on Tomita's parser developed at Carnegie-Mellon University Center for Machine Translation. The grammar covers a substantial subset of Turkish including simple and complex sentences, and deals with a reasonable am...
Representation of Morphosyntactic Units and Coordination Structures in the Turkish Dependency Treebank
This paper presents our preliminary conclusions as part of an ongoing effort to construct a new dependency representation framework for Turkish. We aim for this new framework to accommodate the highly agglutinative morphology of Turkish as well as to allow the annotation of unedited web data, and shape our decisions around these considerations. In this paper, we firstly describe a novel syntact...
Spelling Correction in Agglutinative Languages
Spelling correction is an important component of any system for processing text. Agglutinative languages such as Turkish or Finnish differ from languages like English in the way lexical forms are generated. A typical nominal or verbal root may generate thousands (or even millions) of valid forms which never appear in the dictionary. For instance, we can give the following (rather exaggerated) ...
Spell-Checking based on Syllabification and Character-level Graphs for a Peruvian Agglutinative Language
There are several native languages in Peru which are mostly agglutinative. These languages are transmitted from generation to generation mainly in oral form, causing different forms of writing across different communities. For this reason, there are recent efforts to standardize the spelling in the written texts, and it would be beneficial to support these tasks with an automatic tool such as a...
Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction
This paper presents the notion of error-tolerant recognition with finite-state recognizers along with results from some applications. Error-tolerant recognition enables the recognition of strings that deviate mildly from any string in the regular set recognized by the underlying finite-state recognizer. Such recognition has applications to error-tolerant morphological processing, spelling corre...